


[Fandom stats] Use of the #sherlocklives tag on Tumblr

by toastystats (destinationtoast)



Series: Fandom Stats [18]
Category: Sherlock (TV)
Genre: Cross-Posted on Tumblr, Fanwork Research & Reference Guides, Meta, Nonfiction
Language: English
Status: Completed
Published: 2013-12-22
Updated: 2013-12-22
Packaged: 2019-09-20 00:39:13
Rating: General Audiences
Warnings: No Archive Warnings Apply
Chapters: 1
Words: 1,476
Publisher: archiveofourown.org
Story URL: https://archiveofourown.org/works/17012217
Author URL: https://archiveofourown.org/users/destinationtoast/pseuds/toastystats
Summary: In the lead up to S3, many Sherlock fans on Tumblr used the #sherlocklives tag.  This is an analysis of how & when the tag was used.






**Author's Note:**

> Originally [posted on Tumblr](http://destinationtoast.tumblr.com/post/70837522290/toastystats-how-are-people-using-the). Sorry for some of the images below being low-resolution and/or having really tiny font. I know the labels on one graph are basically illegible; there's a screenshot below of the spreadsheet used to generate it that is hopefully a bit easier to read.

For the latest round of fandom stats, I tried something different from usual -- I tried to analyze fandom data off of Tumblr instead of AO3.  The Three Patch Podcast folks had the great idea to look at how the #sherlocklives tag has been used in the lead up to S3. (The TPP crew has been encouraging folks to get creative with posting their own #sherlocklives contributions with the [Sherlock Fans Around the World](http://threepatchpodcast.tumblr.com/post/68003538902/lets-show-the-bbc-that-fans-around-the-world-are) project -- feel free to join in!  Be careful when exploring the tag, though, as other folks have been posting spoilers.)

**TL;DR: SUMMARY**

This is very experimental -- more than usual -- and there may be large chunks of data missing (as I discuss below the cut).  But here's what I think I've learned (as of December 15, when I did this analysis):

  * **Amount: The #SherlockLives tag has been used at least 4570 times.**  Mostly following the November 23 release by the BBC of the Sherlock S3 teaser trailer, which featured the hashtag, but the tag also saw a handful of uses before that (at least 16 posts), stretching back to January 2012.
  * **Post type: mostly photo and text posts.**  Almost all the posts are photo or text, with text posts making up 47% and photo posts 41%.  This is similar to the makeup of the #sherlock tag.
  * **Co-occurring tags: lots of #sherlock and #benedict cumberbatch; less #johnlock.**   #sherlock is used on 70% of the #sherlocklives posts; after that, there's a sharp dropoff in co-occurring tags.  Other tags that occur on over 10% of the #sherlocklives posts are #bbc sherlock, #benedict cumberbatch, and #r3spects.  (See notes below -- for the purposes of this analysis, I ignored whitespace in tags.)  There are far fewer ship tags on the #sherlocklives posts than on the #sherlock posts in general.  Also, over 1% of the #sherlocklives posts are tagged #three patch podcast!  :)
  * **Timing: sharp peaks of usage.** Datewise, there have been three sharp peaks in activity: Nov 23, Nov 29, and Dec 8.  These correspond with announcements by the BBC.  There appear to be a couple times of day that have had a lot more tag activity than most, as well.
  * **Shipping.** There is substantially less usage of #johnlock and #sherlolly in the #sherlocklives tag than the #sherlock tag overall.



For methodology, graphs, details, and updates, read on!

**METHODOLOGY**

**Fetching the data.**

I used the Tumblr API -- basically an interface that lets you write simple programs to retrieve post content and metadata for up to 20 Tumblr posts at a time.  I used the 'tagged' method to retrieve posts with the #sherlocklives tag (using the TumblPy Python library):

> postdata = client.tagged('sherlocklives', limit=20, before=timestamp)

I fetched the most recent 20 posts, then found the earliest post in that set, and fetched the 20 that immediately preceded that post's timestamp, and repeated until I ran out of posts.
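The paging loop described above can be sketched roughly as follows. This is a minimal illustration, not the script used for this analysis: `fetch_page` is a hypothetical stand-in for whatever call wraps the API's tagged endpoint (such as the `client.tagged` line quoted above), and it is assumed to return a list of post dictionaries that each carry `id` and `timestamp` fields.

```python
def fetch_all_tagged(fetch_page, tag, page_size=20):
    """Walk backwards through a tag, one page at a time.

    fetch_page is a hypothetical callable: fetch_page(tag, limit=..., before=...)
    is assumed to return a list of post dicts, newest first.
    """
    posts = []
    seen_ids = set()
    before = None  # None means "start from the most recent posts"

    while True:
        kwargs = {"limit": page_size}
        if before is not None:
            kwargs["before"] = before
        batch = fetch_page(tag, **kwargs)
        if not batch:
            break  # ran out of posts

        # Skip anything already collected, in case pages overlap.
        new_posts = [p for p in batch if p["id"] not in seen_ids]
        if not new_posts:
            break  # no progress; stop rather than loop forever

        for post in new_posts:
            seen_ids.add(post["id"])
            posts.append(post)

        # The next request asks for posts before the earliest one seen so far.
        before = min(p["timestamp"] for p in batch)

    return posts
```

One caveat with this approach: if several posts share the exact timestamp of the earliest post in a batch, some of them could be skipped by the next request.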

**Huge caveat:**  I ran into various [problems with missing posts](http://destinationtoast.tumblr.com/tagged/tumblr-api) when doing this analysis and a similar analysis on the #sherlock tag.  I am still trying to fully understand what's going on there, but it looks like Tumblr omits all #nsfw posts (or posts from blogs marked NSFW), as well as limiting the number of posts retrieved from a single blog in each batch of 20 posts.  Those issues probably won't affect overall statistics much.  However, when I tried to retrieve all posts in the #sherlock tag, I also found large gaps in the data -- entire days or weeks where no posts were returned, even though I know the tag was being used.  I'm still trying to understand what's going wrong there, with the help of other awesome Tumblfolks like [annathecrow](http://annathecrow.corvidism.com/).  So please take the results I present here with a huge ~~grain~~  pile of salt, and keep in mind that there may be large numbers of posts tagged #sherlocklives which I did not successfully fetch from Tumblr and which might substantially change some of these stats.  

**Analyzing the tags.**

I decided to collapse all tags with different use of capitalization or whitespace into the same bin for this analysis.  In other words, #Benedict Cumberbatch, #benedict cumberbatch, and #benedictcumberbatch all get treated the same by this analysis.  Tumblr ignores capitalization but not whitespace, so Tumblr thinks the tags #benedict cumberbatch and #benedictcumberbatch are separate things.  But I was more interested in what people were talking about in general than in how they chose to use whitespace.  

I didn't remove duplicate tags from posts, though, so a post that was tagged both #benedict cumberbatch and #benedictcumberbatch will get counted twice toward that tag.  (Oops -- should fix this in the future.)
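Here is a small sketch of the tag binning described above, including one way the double-counting could be avoided. It is an illustration under assumptions rather than the original script: the function names are made up, and each post is assumed to be a dictionary with a `tags` list.

```python
from collections import Counter


def normalize_tag(tag):
    """Collapse capitalization and whitespace: 'Benedict Cumberbatch' -> 'benedictcumberbatch'."""
    return "".join(tag.lower().split())


def count_tags(posts, dedupe_per_post=True):
    """Count how many times each normalized tag appears across posts.

    With dedupe_per_post=True, a post tagged with two spellings of the
    same tag only counts once toward that tag.
    """
    counts = Counter()
    for post in posts:
        tags = [normalize_tag(t) for t in post.get("tags", [])]
        if dedupe_per_post:
            tags = set(tags)
        counts.update(tags)
    return counts
```

Calling `count_tags(posts, dedupe_per_post=False)` reproduces the behavior described above, where both spellings on one post count toward the same bin.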

**Analyzing the dates and times.**

I am not sure I got this part right.  I'm pretty sure that the timestamps Tumblr returns are UTC (GMT), and then I converted these to dates and hours using the Python datetime library.  However, there's some chance that I misunderstood something, and that the dates shown in these graphs are actually Pacific Time (my timezone). 
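One way to take the guesswork out of the conversion is to name the timezone explicitly. The sketch below assumes the timestamps are seconds since the Unix epoch; the helper names are made up for illustration. Python's `datetime.fromtimestamp` uses the computer's local timezone when no `tz` argument is given, which is one possible (unconfirmed) source of the ambiguity described above.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-8 offset: Pacific Standard Time, which applies in November and December.
PST = timezone(timedelta(hours=-8), "PST")


def to_utc(ts):
    """Interpret an epoch timestamp (seconds) as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def to_pacific(ts):
    """Interpret the same timestamp as Pacific Standard Time."""
    return datetime.fromtimestamp(ts, tz=PST)


# Build the timestamp for 9pm UTC on Dec 8, 2013 instead of hard-coding a number.
example_ts = datetime(2013, 12, 8, 21, 0, tzinfo=timezone.utc).timestamp()

print(to_utc(example_ts).hour)      # 21
print(to_pacific(example_ts).hour)  # 13
```

Because both helpers return timezone-aware datetimes, the result does not depend on the timezone of the machine running the script.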

**Comparison to #sherlock tag.**

I also did an analysis of over 70,000 posts tagged #sherlock since Jan 2012.  I tried to analyze all posts tagged #sherlock from Tumblr, but there were [huge gaps](http://destinationtoast.tumblr.com/post/70347221499/tumblr-stats-potential-api-issue) in the data returned -- in fact, a whole lot of the posts that I used for my #sherlocklives analysis, which were also tagged #sherlock, were missing when I tried to grab all the #sherlock posts (e.g., Tumblr failed to return all posts from Nov 23-27 tagged #sherlock).  So my comparisons to the #sherlock tag should be taken with an even bigger pile of salt.

**RESULTS AND DISCUSSION**

**Activity in the #sherlocklives tag from Nov 22-Dec 15.**

The #sherlocklives tag has seen a bunch of activity most days since the BBC teaser trailer release.  There were at least 4570 posts that used this tag in the given date range -- and there may have been a bunch more.  There were at least 19,500 posts in the #sherlock tag in the same time frame, but a bunch of posts were obviously missing from that data set (see above methodology section).  I'm guessing, based on huge amounts of extrapolation to try to fill in the missing data, that around 3-10% of the #sherlock posts in this time frame were also tagged #sherlocklives.

Here's the timeline of activity for #sherlocklives:

[](https://photos.google.com/share/AF1QipNfIAP2nBhjNxva3s1oYWyv_nDghuCmNa8INN4wE96IKzjWYKtksd627_5d-6cisA/photo/AF1QipOKrgWfjxW54ZT_CI8rJibfa2izzp_bMucVSSei?key=UzBPVlZCQ3p1S1VnWHVkM2tDR3lQNGs2VmRBM3BR)

The three spikes correspond to:

>   1. 11/23 - Release of the 30 second Sherlock trailer with Sherlock's grave.... 
>   2. 11/29 - The day of the BBC hearse stunt in London on Friday morning. A lot of people tweeted and reblogged tweets and photos of the hearse and those chasing it.  
>   3. 12/8 - Release of BBC interactive trailer around 9pm UK time. 
> 


(Thanks to [penns-woods](http://penns-woods.tumblr.com/) for the above info, and [dudeufugly](http://dudeufugly.tumblr.com/) for corroborating!) There are also certain hours of the day that have seen a lot more activity (probably mostly on the same dates singled out above): 

[](https://photos.google.com/share/AF1QipNfIAP2nBhjNxva3s1oYWyv_nDghuCmNa8INN4wE96IKzjWYKtksd627_5d-6cisA/photo/AF1QipPa5Tx_bVqUQhBD5V-FsUcZjWT_ImdP6kuoLZre?key=UzBPVlZCQ3p1S1VnWHVkM2tDR3lQNGs2VmRBM3BR)

Despite the fact that it says "timezone = GMT", for the reasons described above (in Methodology), I think this graph could actually be showing Pacific time.  If so, the second peak on the graph would line up well with the timing of the BBC interactive trailer -- 1pm Pacific time is 9pm GMT.  Once I determine the timezone for sure, I'll update this post. 
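To see how much the timezone question matters for these two graphs, the per-day and per-hour counts can be computed under either assumption and compared. This is a rough sketch, not the code behind the graphs above; `bin_posts` and its `utc_offset_hours` parameter are hypothetical names, and each post is assumed to have an epoch `timestamp` in seconds.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def bin_posts(posts, utc_offset_hours=0):
    """Count posts per calendar day and per hour of day in a fixed-offset timezone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    per_day = Counter()
    per_hour = Counter()
    for post in posts:
        moment = datetime.fromtimestamp(post["timestamp"], tz=tz)
        per_day[moment.date().isoformat()] += 1
        per_hour[moment.hour] += 1
    return per_day, per_hour
```

Running it once with `utc_offset_hours=0` and once with `utc_offset_hours=-8` shifts every hourly bin by eight hours, and moves posts made shortly after midnight GMT onto the previous day.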

**Types of posts made.**

The #sherlocklives tag has the following breakdown of post types, which is nearly the same as the #sherlock tag: 

[](https://photos.google.com/share/AF1QipNfIAP2nBhjNxva3s1oYWyv_nDghuCmNa8INN4wE96IKzjWYKtksd627_5d-6cisA/photo/AF1QipOjwCuJ4B23ZKRgksiYFvdvVm3QBVZKnZ6uEEYz?key=UzBPVlZCQ3p1S1VnWHVkM2tDR3lQNGs2VmRBM3BR)

Almost all of the posts are text or photo posts.  This is probably similar to Tumblr posts overall, but I haven't checked that. 
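A breakdown like this can be computed from the `type` field that each post carries (values such as `text` or `photo`). The following is a minimal sketch with a made-up function name, not the spreadsheet method used for the graph above.

```python
from collections import Counter


def post_type_shares(posts):
    """Return the percentage of posts of each type, most common type first."""
    counts = Counter(post["type"] for post in posts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100 * n / total for kind, n in counts.most_common()}
```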

**Tags used.**

The most common tag to co-occur with #sherlocklives is #sherlock, which was used on 70% of the #sherlocklives posts.  I've omitted that one from the following graph, which shows other top tags: 

[](https://photos.google.com/share/AF1QipNfIAP2nBhjNxva3s1oYWyv_nDghuCmNa8INN4wE96IKzjWYKtksd627_5d-6cisA/photo/AF1QipNaI6Gl_wEGjNfflN4GmfjLt14jRU-K4K7Ilv2s?key=UzBPVlZCQ3p1S1VnWHVkM2tDR3lQNGs2VmRBM3BR)

And here is the full set of top 50 tags: 

[](https://photos.google.com/share/AF1QipNfIAP2nBhjNxva3s1oYWyv_nDghuCmNa8INN4wE96IKzjWYKtksd627_5d-6cisA/photo/AF1QipMNji9FAeTivKX_sKy_wsgkbonHwkymyCOSgC59?key=UzBPVlZCQ3p1S1VnWHVkM2tDR3lQNGs2VmRBM3BR)

**Shipping.**

There are fewer posts tagged #johnlock or other ship names in the #sherlocklives tag than in the #sherlock tag.  Just looking at #johnlock and #sherlolly (the top two ship names), #johnlock gets less than half as much activity in the #sherlocklives tag as in the #sherlock tag (4.1% vs. 9.1%), and #sherlolly also sees less usage in #sherlocklives than in #sherlock (0.35% vs. 0.53%): 

[](https://photos.google.com/share/AF1QipNfIAP2nBhjNxva3s1oYWyv_nDghuCmNa8INN4wE96IKzjWYKtksd627_5d-6cisA/photo/AF1QipN4Tk51QSSUYgge5Ff4dksVEl5DMxSBbqdrUkIN?key=UzBPVlZCQ3p1S1VnWHVkM2tDR3lQNGs2VmRBM3BR)
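The comparison above comes down to computing, for each set of posts, the share that carries a given tag. A minimal sketch of that calculation follows; the function names are illustrative, posts are assumed to be dictionaries with a `tags` list, and tags are matched after ignoring capitalization and whitespace, as in the methodology.

```python
def normalize_tag(tag):
    """Lowercase a tag and strip all whitespace."""
    return "".join(tag.lower().split())


def tag_share(posts, tag):
    """Percentage of posts that carry `tag` at least once."""
    if not posts:
        return 0.0
    target = normalize_tag(tag)
    hits = sum(
        1
        for post in posts
        if target in {normalize_tag(t) for t in post.get("tags", [])}
    )
    return 100 * hits / len(posts)
```

Applying `tag_share` to the #sherlocklives posts and to the #sherlock posts gives the two percentages to compare. With the figures reported above, 4.1 / 9.1 is about 0.45 for #johnlock and 0.35 / 0.53 is about 0.66 for #sherlolly.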

**FINAL THOUGHTS**

Excitement continues to grow in the #sherlock fandom, and a lot of that is directed to the #sherlocklives tag!  Every time the BBC makes an announcement about S3, the activity really spikes -- but it's been getting a bunch of usage the rest of the time, too.  I'll be continuing to explore what data I can get from Tumblr about the tag.   While we wait for S3, though, be careful as you explore the tags -- some folks have been posting major spoilers in the tags in the wake of the BFI screening.  But the [Three Patch Podcast blog](http://threepatchpodcast.tumblr.com/) has been reblogging lots of fun (and spoiler-free) uses of the #sherlocklives tag, if you want to check those out.

**Author's Note:**

> Comments welcome, but I’m in the middle of a massive fandom stats backup due to the Tumblr purge, so I may be slow to respond.


